Generalized Linear Mixed Model

In statistics, a generalized linear mixed model (GLMM) is an extension of the generalized linear model (GLM) in which the linear predictor contains random effects in addition to the usual fixed effects. GLMMs also inherit from GLMs the idea of extending linear mixed models to non-normal data. They provide a broad range of models for the analysis of grouped data, since the differences between groups can be modelled as a random effect. These models are useful in the analysis of many kinds of data, including longitudinal data.
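The idea that differences between groups enter the model as a random effect can be illustrated by simulating grouped binary data. The following is a minimal sketch, assuming a logistic link and a single normally distributed random intercept per group; the function name, parameter names and parameter values are hypothetical and chosen only for illustration.

```python
import math
import random


def simulate_random_intercept_logistic(n_groups, n_per_group, beta0, beta1,
                                       sigma_u, seed=0):
    """Simulate binary responses from a logistic model with one random
    intercept per group (illustrative assumptions, not a reference design)."""
    rng = random.Random(seed)
    rows = []
    for group in range(n_groups):
        # Random effect: one draw per group, shared by all of its observations.
        u = rng.gauss(0.0, sigma_u)
        for _ in range(n_per_group):
            x = rng.uniform(-1.0, 1.0)           # a single covariate
            eta = beta0 + beta1 * x + u          # fixed part plus random part
            p = 1.0 / (1.0 + math.exp(-eta))     # inverse logit link
            y = 1 if rng.random() < p else 0     # Bernoulli response
            rows.append((group, x, y))
    return rows


rows = simulate_random_intercept_logistic(
    n_groups=5, n_per_group=4, beta0=-0.5, beta1=1.0, sigma_u=1.0
)
for group, x, y in rows[:4]:
    print(group, round(x, 3), y)
```

Observations in the same group share the same draw of u, which is what induces the within-group correlation that a GLMM is meant to capture.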


Model

GLMMs are generally defined such that, conditioned on the random effects u, the dependent variable y is distributed according to the exponential family, with its expectation related to the linear predictor X\beta+Zu via a link function g:
:g(E[y \vert u]) = X\beta + Zu.
Here X and \beta are the fixed-effects design matrix and the fixed effects, respectively; Z and u are the random-effects design matrix and the random effects, respectively. Understanding this very brief definition first requires understanding the definitions of a generalized linear model and of a mixed model.

Generalized linear mixed models are a special case of hierarchical generalized linear models in which the random effects are normally distributed.

The log-likelihood of the observed data, obtained by integrating out the random effects,
:\ln p(y) = \ln \int p(y \vert u)\, p(u)\, du
has no general closed form, and integrating over the random effects is usually extremely computationally intensive. In addition to numerically approximating this integral (e.g. via Gauss–Hermite quadrature), methods motivated by the Laplace approximation have been proposed. For example, the penalized quasi-likelihood method, which essentially involves repeatedly fitting a weighted normal mixed model with a working variate (i.e. a doubly iterative procedure), is implemented by various commercial and open source statistical programs.
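The integral over the random effects can be made concrete for the simplest case: one group, a logistic link, and a single random intercept u with standard deviation sigma. The sketch below compares a brute-force numerical integration with a Laplace approximation. To stay dependency-free it uses a plain trapezoid rule as a stand-in for Gauss–Hermite quadrature, and all function names, the integration width and the example data are assumptions made for illustration.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def log_joint(u, y, eta, sigma):
    """log p(y | u) + log p(u) for a logistic model with random intercept u.
    eta holds the fixed-effects part of the linear predictor per observation."""
    log_lik = sum(yi * (e + u) - softplus(e + u) for yi, e in zip(y, eta))
    log_prior = -0.5 * math.log(2.0 * math.pi * sigma**2) - u**2 / (2.0 * sigma**2)
    return log_lik + log_prior


def marginal_loglik_trapezoid(y, eta, sigma, n_points=2001, width=8.0):
    """Approximate ln p(y) with a trapezoid rule on [-width*sigma, width*sigma]."""
    lo, hi = -width * sigma, width * sigma
    step = (hi - lo) / (n_points - 1)
    total = 0.0
    for k in range(n_points):
        weight = 0.5 if k in (0, n_points - 1) else 1.0
        total += weight * math.exp(log_joint(lo + k * step, y, eta, sigma))
    return math.log(total * step)


def _grad_hess(u, y, eta, sigma):
    """First and second derivative of log_joint with respect to u."""
    p = [1.0 / (1.0 + math.exp(-(e + u))) for e in eta]
    grad = sum(yi - pi for yi, pi in zip(y, p)) - u / sigma**2
    hess = -sum(pi * (1.0 - pi) for pi in p) - 1.0 / sigma**2
    return grad, hess


def marginal_loglik_laplace(y, eta, sigma, n_iter=50):
    """Laplace approximation: expand log_joint to second order at its mode."""
    u = 0.0
    for _ in range(n_iter):              # Newton iterations towards the mode
        grad, hess = _grad_hess(u, y, eta, sigma)
        u -= grad / hess
    _, hess = _grad_hess(u, y, eta, sigma)
    return (log_joint(u, y, eta, sigma)
            + 0.5 * math.log(2.0 * math.pi)
            - 0.5 * math.log(-hess))


# Hypothetical data for one group: four binary responses and their fixed parts.
y = [1, 0, 1, 1]
eta = [0.2, -0.1, 0.4, 0.0]
print(marginal_loglik_trapezoid(y, eta, sigma=1.0))
print(marginal_loglik_laplace(y, eta, sigma=1.0))
```

With several groups and independent random intercepts, the full log-likelihood would be the sum of such per-group terms. With higher-dimensional random effects the integral becomes multidimensional, which is where the computational cost noted above arises.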


Fitting a model

Fitting GLMMs via maximum likelihood (and comparing fitted models via likelihood-based criteria such as AIC) involves integrating over the random effects. In general, those integrals cannot be expressed in analytical form. Various approximate methods have been developed, but none has good properties for all possible models and data sets (e.g. ungrouped binary data are particularly problematic). For this reason, methods involving numerical quadrature or Markov chain Monte Carlo have increased in use, as increasing computing power and advances in methods have made them more practical.

The Akaike information criterion (AIC) is a common criterion for model selection. Estimates of AIC for GLMMs based on certain exponential family distributions have recently been obtained.
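As a rough illustration of how AIC is used for model selection, the sketch below applies the standard formula AIC = 2k - 2 ln L to two candidate models. It assumes that ln L is an (approximate) marginal log-likelihood and that k counts the fixed effects plus the variance parameters; how parameters should be counted for mixed models is itself a modelling choice, so this is only one convention. The function name and the log-likelihood values are made up for the example.

```python
def marginal_aic(log_likelihood, n_fixed_effects, n_variance_parameters):
    """AIC = 2k - 2 ln L, with k counted as fixed effects plus variance
    parameters (an assumed convention, not the only possible one)."""
    n_params = n_fixed_effects + n_variance_parameters
    return 2.0 * n_params - 2.0 * log_likelihood


# Hypothetical maximized marginal log-likelihoods for two candidate GLMMs.
candidates = {
    "intercept only": marginal_aic(-134.2, n_fixed_effects=1, n_variance_parameters=1),
    "with covariate": marginal_aic(-128.9, n_fixed_effects=2, n_variance_parameters=1),
}

# The model with the smallest AIC is preferred.
best = min(candidates, key=candidates.get)
print(best)
```

Because the log-likelihood itself is usually only available through an approximation, AIC values computed under different approximation methods are not necessarily comparable.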


Software

* Several contributed packages in R provide GLMM functionality, including lme4 and glmm.
* GLMMs can be fitted using SAS and SPSS.
* MATLAB also provides a function called "fitglme" to fit GLMMs.
* The Python package Statsmodels supports binomial and Poisson implementations.
* The Julia package MixedModels.jl provides a function called GeneralizedLinearMixedModel that fits a GLMM to provided data.
* DHARMa: residual diagnostics for hierarchical (multi-level/mixed) regression models (utk.edu)


See also

* Generalized estimating equation
* Hierarchical generalized linear model

